Overview
Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 101 |
| Missing cells | 18 |
| Missing cells (%) | 1.8% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 42.2 KiB |
| Average record size in memory | 427.7 B |
Variable types
| Text | 3 |
|---|---|
| Categorical | 3 |
| Numeric | 2 |
| DateTime | 1 |
| Boolean | 1 |
Alerts
| Alert | Type |
|---|---|
| department is highly overall correlated with first_name and 2 other fields | High correlation |
| first_name is highly overall correlated with department and 2 other fields | High correlation |
| is_active is highly overall correlated with department and 3 other fields | High correlation |
| last_name is highly overall correlated with department and 2 other fields | High correlation |
| salary is highly overall correlated with is_active | High correlation |
| is_active is highly imbalanced (50.3%) | Imbalance |
| email has 7 (6.9%) missing values | Missing |
| phone has 6 (5.9%) missing values | Missing |
| age has 5 (5.0%) missing values | Missing |
| department is uniformly distributed | Uniform |
| customer_id has unique values | Unique |
| hire_date has unique values | Unique |
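The dataset-level missing-cell figures can be cross-checked against the per-column counts reported for email, phone, and age. The sketch below is only arithmetic on numbers that appear in this report; the variable names are illustrative.

```python
# Per-column missing counts as reported for email, phone, and age.
missing_per_column = {"email": 7, "phone": 6, "age": 5}

n_rows = 101      # Number of observations
n_columns = 10    # Number of variables

# Missing cells is the sum over columns; the percentage is taken
# over every cell in the table (rows x columns).
missing_cells = sum(missing_per_column.values())
total_cells = n_rows * n_columns
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)            # 18
print(f"{missing_pct:.1f}%")    # 1.8%
```

Both results agree with the "Missing cells" rows in the dataset statistics.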
Reproduction
| Analysis started | 2025-08-28 07:44:50.628425 |
|---|---|
| Analysis finished | 2025-08-28 07:44:53.974615 |
| Duration | 3.35 seconds |
| Software version | ydata-profiling v4.16.1 |
Variables
customer_id
Text
Unique 
| Distinct | 101 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Unique
| Unique | 101 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | CUST_01000 |
|---|---|
| 2nd row | CUST_01001 |
| 3rd row | CUST_01002 |
| 4th row | CUST_01003 |
| 5th row | CUST_01004 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| cust_01012 | 1 | 1.0% |
| cust_01100 | 1 | 1.0% |
| cust_01000 | 1 | 1.0% |
| cust_01001 | 1 | 1.0% |
| cust_01002 | 1 | 1.0% |
| cust_01093 | 1 | 1.0% |
| cust_01094 | 1 | 1.0% |
| cust_01095 | 1 | 1.0% |
| cust_01096 | 1 | 1.0% |
| cust_01097 | 1 | 1.0% |
| Other values (91) | 91 | 90.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 223 | 22.1% |
| 1 | 122 | 12.1% |
| C | 101 | 10.0% |
| S | 101 | 10.0% |
| U | 101 | 10.0% |
| _ | 101 | 10.0% |
| T | 101 | 10.0% |
| 2 | 20 | 2.0% |
| 3 | 20 | 2.0% |
| 8 | 20 | 2.0% |
| Other values (5) | 100 | 9.9% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1010 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1010 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1010 | 100.0% |
first_name
Categorical
High correlation 
| Distinct | 11 |
|---|---|
| Distinct (%) | 10.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.4 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 5 |
| Mean length | 4.5940594 |
| Min length | 3 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 1.0% |
Sample
| 1st row | John |
|---|---|
| 2nd row | Jane |
| 3rd row | Bob |
| 4th row | Alice |
| 5th row | Charlie |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| John | 10 | 9.9% |
| Jane | 10 | 9.9% |
| Bob | 10 | 9.9% |
| Alice | 10 | 9.9% |
| Charlie | 10 | 9.9% |
| Diana | 10 | 9.9% |
| Eve | 10 | 9.9% |
| Frank | 10 | 9.9% |
| Grace | 10 | 9.9% |
| Henry | 10 | 9.9% |
Length
Histogram of lengths of the category
Words
| Value | Count | Frequency (%) |
|---|---|---|
| john | 10 | 9.9% |
| jane | 10 | 9.9% |
| bob | 10 | 9.9% |
| alice | 10 | 9.9% |
| charlie | 10 | 9.9% |
| diana | 10 | 9.9% |
| eve | 10 | 9.9% |
| frank | 10 | 9.9% |
| grace | 10 | 9.9% |
| henry | 10 | 9.9% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| a | 61 | 13.1% |
| e | 60 | 12.9% |
| n | 50 | 10.8% |
| r | 40 | 8.6% |
| i | 30 | 6.5% |
| h | 20 | 4.3% |
| o | 20 | 4.3% |
| c | 20 | 4.3% |
| J | 20 | 4.3% |
| l | 20 | 4.3% |
| Other values (13) | 123 | 26.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 464 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 464 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 464 | 100.0% |
last_name
Categorical
High correlation 
| Distinct | 11 |
|---|---|
| Distinct (%) | 10.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.6 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 8 |
| Mean length | 6.3960396 |
| Min length | 5 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 1.0% |
Sample
| 1st row | Smith |
|---|---|
| 2nd row | Johnson |
| 3rd row | Williams |
| 4th row | Brown |
| 5th row | Jones |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Smith | 10 | 9.9% |
| Johnson | 10 | 9.9% |
| Williams | 10 | 9.9% |
| Brown | 10 | 9.9% |
| Jones | 10 | 9.9% |
| Garcia | 10 | 9.9% |
| Miller | 10 | 9.9% |
| Davis | 10 | 9.9% |
| Rodriguez | 10 | 9.9% |
| Martinez | 10 | 9.9% |
Length
Histogram of lengths of the category
Words
| Value | Count | Frequency (%) |
|---|---|---|
| smith | 10 | 9.9% |
| johnson | 10 | 9.9% |
| williams | 10 | 9.9% |
| brown | 10 | 9.9% |
| jones | 10 | 9.9% |
| garcia | 10 | 9.9% |
| miller | 10 | 9.9% |
| davis | 10 | 9.9% |
| rodriguez | 10 | 9.9% |
| martinez | 10 | 9.9% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| i | 81 | 12.5% |
| o | 51 | 7.9% |
| n | 51 | 7.9% |
| a | 50 | 7.7% |
| r | 50 | 7.7% |
| s | 41 | 6.3% |
| l | 41 | 6.3% |
| e | 40 | 6.2% |
| J | 20 | 3.1% |
| m | 20 | 3.1% |
| Other values (16) | 201 | 31.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 646 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 646 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 646 | 100.0% |
email
Text
Missing 
| Distinct | 94 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 7 |
| Missing (%) | 6.9% |
| Memory size | 6.5 KiB |
Length
| Max length | 19 |
|---|---|
| Median length | 18 |
| Mean length | 17.914894 |
| Min length | 17 |
Unique
| Unique | 94 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | user1@example.com |
|---|---|
| 2nd row | user2@example.com |
| 3rd row | user3@example.com |
| 4th row | user4@example.com |
| 5th row | user5@example.com |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| user37@example.com | 1 | 1.1% |
| user73@example.com | 1 | 1.1% |
| user72@example.com | 1 | 1.1% |
| user71@example.com | 1 | 1.1% |
| user70@example.com | 1 | 1.1% |
| user69@example.com | 1 | 1.1% |
| user40@example.com | 1 | 1.1% |
| user39@example.com | 1 | 1.1% |
| user38@example.com | 1 | 1.1% |
| user24@example.com | 1 | 1.1% |
| Other values (84) | 84 | 89.4% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 282 | 16.7% |
| m | 188 | 11.2% |
| u | 94 | 5.6% |
| r | 94 | 5.6% |
| @ | 94 | 5.6% |
| x | 94 | 5.6% |
| s | 94 | 5.6% |
| l | 94 | 5.6% |
| . | 94 | 5.6% |
| a | 94 | 5.6% |
| Other values (13) | 462 | 27.4% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1684 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1684 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1684 | 100.0% |
phone
Text
Missing 
| Distinct | 95 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 6 |
| Missing (%) | 5.9% |
| Memory size | 6.3 KiB |
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 15 |
| Min length | 15 |
Unique
| Unique | 95 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | +1-555-202-1860 |
|---|---|
| 2nd row | +1-555-370-6191 |
| 3rd row | +1-555-800-6734 |
| 4th row | +1-555-221-1466 |
| 5th row | +1-555-314-5426 |
Words
| Value | Count | Frequency (%) |
|---|---|---|
| 1-555-921-7873 | 1 | 1.1% |
| 1-555-653-8035 | 1 | 1.1% |
| 1-555-492-1206 | 1 | 1.1% |
| 1-555-497-7938 | 1 | 1.1% |
| 1-555-602-7910 | 1 | 1.1% |
| 1-555-104-8385 | 1 | 1.1% |
| 1-555-561-4840 | 1 | 1.1% |
| 1-555-369-8629 | 1 | 1.1% |
| 1-555-301-1995 | 1 | 1.1% |
| 1-555-655-1161 | 1 | 1.1% |
| Other values (85) | 85 | 89.5% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 348 | 24.4% |
| - | 285 | 20.0% |
| 1 | 161 | 11.3% |
| + | 95 | 6.7% |
| 6 | 83 | 5.8% |
| 8 | 76 | 5.3% |
| 4 | 72 | 5.1% |
| 3 | 68 | 4.8% |
| 9 | 68 | 4.8% |
| 2 | 62 | 4.4% |
| Other values (2) | 107 | 7.5% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1425 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1425 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 1425 | 100.0% |
age
Real number (ℝ)
Missing 
| Distinct | 49 |
|---|---|
| Distinct (%) | 51.0% |
| Missing | 5 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49.09375 |
| Minimum | 18 |
| Maximum | 79 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 940.0 B |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 33.75 |
| median | 49 |
| Q3 | 66.25 |
| 95-th percentile | 77 |
| Maximum | 79 |
| Range | 61 |
| Interquartile range (IQR) | 32.5 |
Descriptive statistics
| Standard deviation | 19.273149 |
|---|---|
| Coefficient of variation (CV) | 0.39257847 |
| Kurtosis | -1.2700166 |
| Mean | 49.09375 |
| Median Absolute Deviation (MAD) | 17 |
| Skewness | -0.063984086 |
| Sum | 4713 |
| Variance | 371.45428 |
| Monotonicity | Not monotonic |
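Several of the age statistics follow directly from others in the same tables, which gives a quick consistency check. The sketch below uses only values copied from this report; note that the mean is taken over the non-missing rows, which is what reproduces the reported figure.

```python
# Values copied from the age tables in this report.
total = 4713             # Sum
n_rows = 101             # Number of observations
n_missing = 5            # Missing
std = 19.273149          # Standard deviation
q1, q3 = 33.75, 66.25    # Q1 and Q3
minimum, maximum = 18, 79

n_valid = n_rows - n_missing        # rows that actually carry an age
mean = total / n_valid              # mean over non-missing rows
cv = std / mean                     # coefficient of variation
iqr = q3 - q1                       # interquartile range
value_range = maximum - minimum

print(n_valid)          # 96
print(mean)             # 49.09375
print(round(cv, 4))     # 0.3926
print(iqr)              # 32.5
print(value_range)      # 61
```

Each derived value matches the corresponding row (Mean, CV, IQR, Range) reported for age.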
Histogram with fixed size bins (bins=49)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 66 | 5 | 5.0% |
| 69 | 4 | 4.0% |
| 19 | 4 | 4.0% |
| 49 | 4 | 4.0% |
| 76 | 4 | 4.0% |
| 77 | 4 | 4.0% |
| 20 | 3 | 3.0% |
| 36 | 3 | 3.0% |
| 50 | 3 | 3.0% |
| 43 | 3 | 3.0% |
| Other values (39) | 59 | 58.4% |
| (Missing) | 5 | 5.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 18 | 2 | 2.0% |
| 19 | 4 | 4.0% |
| 20 | 3 | 3.0% |
| 21 | 1 | 1.0% |
| 22 | 1 | 1.0% |
| 23 | 3 | 3.0% |
| 24 | 2 | 2.0% |
| 26 | 1 | 1.0% |
| 28 | 3 | 3.0% |
| 29 | 1 | 1.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 79 | 2 | 2.0% |
| 77 | 4 | 4.0% |
| 76 | 4 | 4.0% |
| 75 | 1 | 1.0% |
| 74 | 2 | 2.0% |
| 73 | 2 | 2.0% |
| 72 | 2 | 2.0% |
| 71 | 1 | 1.0% |
| 70 | 1 | 1.0% |
| 69 | 4 | 4.0% |
salary
Real number (ℝ)
High correlation 
| Distinct | 98 |
|---|---|
| Distinct (%) | 97.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82866.842 |
| Minimum | -999 |
| Maximum | 149181 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 4 |
| Negative (%) | 4.0% |
| Memory size | 940.0 B |
Quantile statistics
| Minimum | -999 |
|---|---|
| 5-th percentile | 32200 |
| Q1 | 51357 |
| median | 78404 |
| Q3 | 112989 |
| 95-th percentile | 146336 |
| Maximum | 149181 |
| Range | 150180 |
| Interquartile range (IQR) | 61632 |
Descriptive statistics
| Standard deviation | 39147.261 |
|---|---|
| Coefficient of variation (CV) | 0.47241165 |
| Kurtosis | -0.74935224 |
| Mean | 82866.842 |
| Median Absolute Deviation (MAD) | 29101 |
| Skewness | -0.044370351 |
| Sum | 8369551 |
| Variance | 1.5325081 × 10⁹ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| -999 | 4 | 4.0% |
| 106213 | 1 | 1.0% |
| 104290 | 1 | 1.0% |
| 33436 | 1 | 1.0% |
| 77333 | 1 | 1.0% |
| 109909 | 1 | 1.0% |
| 146381 | 1 | 1.0% |
| 36893 | 1 | 1.0% |
| 47429 | 1 | 1.0% |
| 49830 | 1 | 1.0% |
| Other values (88) | 88 | 87.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -999 | 4 | 4.0% |
| 32049 | 1 | 1.0% |
| 32200 | 1 | 1.0% |
| 32557 | 1 | 1.0% |
| 32869 | 1 | 1.0% |
| 33436 | 1 | 1.0% |
| 33987 | 1 | 1.0% |
| 34499 | 1 | 1.0% |
| 35539 | 1 | 1.0% |
| 35600 | 1 | 1.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 149181 | 1 | 1.0% |
| 149121 | 1 | 1.0% |
| 147858 | 1 | 1.0% |
| 147796 | 1 | 1.0% |
| 146381 | 1 | 1.0% |
| 146336 | 1 | 1.0% |
| 142893 | 1 | 1.0% |
| 142476 | 1 | 1.0% |
| 142296 | 1 | 1.0% |
| 139616 | 1 | 1.0% |
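The value -999 occurs four times and is the only negative salary, with the next smallest value at 32049. One plausible reading, which the report itself does not confirm, is that -999 is a placeholder for an unknown salary. Under that assumption, the sketch below shows how far the reported mean shifts once those four rows are set aside, using only the Sum and counts from this report.

```python
SENTINEL = -999          # ASSUMPTION: placeholder for "unknown", not a real salary

total = 8369551          # Sum, as reported
n_rows = 101             # Number of observations (salary has no missing values)
n_sentinel = 4           # rows equal to -999 (reported as Negative = 4)

reported_mean = total / n_rows

# Remove the sentinel rows from both the sum and the row count.
clean_total = total - n_sentinel * SENTINEL
clean_n = n_rows - n_sentinel
clean_mean = clean_total / clean_n

print(round(reported_mean, 3))   # 82866.842
print(clean_total)               # 8373547
print(round(clean_mean, 2))      # 86325.23
```

If the assumption holds, the four placeholder rows pull the mean down by roughly 3,500; if -999 is a genuine value, the reported statistics stand as they are.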
department
Categorical
High correlation  Uniform 
| Distinct | 5 |
|---|---|
| Distinct (%) | 5.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.6 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 7 |
| Mean length | 6.7821782 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Sales |
|---|---|
| 2nd row | Marketing |
| 3rd row | Engineering |
| 4th row | HR |
| 5th row | Finance |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Sales | 21 | 20.8% |
| Marketing | 20 | 19.8% |
| Engineering | 20 | 19.8% |
| HR | 20 | 19.8% |
| Finance | 20 | 19.8% |
Length
Histogram of lengths of the category
Common Values (Plot)
Words
| Value | Count | Frequency (%) |
|---|---|---|
| sales | 21 | 20.8% |
| marketing | 20 | 19.8% |
| engineering | 20 | 19.8% |
| hr | 20 | 19.8% |
| finance | 20 | 19.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| n | 120 | 17.5% |
| e | 101 | 14.7% |
| i | 80 | 11.7% |
| a | 61 | 8.9% |
| g | 60 | 8.8% |
| r | 40 | 5.8% |
| S | 21 | 3.1% |
| s | 21 | 3.1% |
| l | 21 | 3.1% |
| t | 20 | 2.9% |
| Other values (7) | 140 | 20.4% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 685 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 685 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 685 | 100.0% |
hire_date
Date
Unique 
| Distinct | 101 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 940.0 B |
| Minimum | 2016-03-04 00:00:00 |
| Maximum | 2025-08-24 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
Histogram with fixed size bins (bins=50)
is_active
Boolean
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 233.0 B |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| True | 90 | 89.1% |
| False | 11 | 10.9% |
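The 50.3% imbalance figure reported for is_active can be reproduced from the True/False counts as one minus the normalized Shannon entropy of the class shares. The report does not state the formula, so treat this as an assumption that happens to match the reported number; `imbalance_score` is an illustrative helper, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes.
    0 means perfectly balanced, 1 means a single class holds everything."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts taken from the is_active table: 90 True, 11 False.
score = imbalance_score([90, 11])
print(f"{score:.1%}")                            # 50.3%
print(f"{imbalance_score([50, 50]):.1%}")        # 0.0%
```

On this scale an 89/11 split already scores about 0.5, which is why a column that is "only" 89% True gets flagged.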
Interactions
Correlations
| | age | department | first_name | is_active | last_name | salary |
|---|---|---|---|---|---|---|
| age | 1.000 | 0.000 | 0.119 | 0.000 | 0.119 | 0.092 |
| department | 0.000 | 1.000 | 0.968 | 0.656 | 0.968 | 0.161 |
| first_name | 0.119 | 0.968 | 1.000 | 0.953 | 1.000 | 0.227 |
| is_active | 0.000 | 0.656 | 0.953 | 1.000 | 0.953 | 0.552 |
| last_name | 0.119 | 0.968 | 1.000 | 0.953 | 1.000 | 0.227 |
| salary | 0.092 | 0.161 | 0.227 | 0.552 | 0.227 | 1.000 |
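The matrix mixes categorical and numeric columns, so the coefficients cannot all be Pearson correlations. The report does not name the measure; a common choice for two categorical columns is Cramér's V, and the sketch below assumes that (in its plain, not bias-corrected form). It illustrates why first_name and last_name score 1.000: in the sample rows each first name always appears with the same last name. The lists are short illustrative stand-ins, not the profiled data.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    # Chi-squared statistic over the full contingency table,
    # including cells where the pair never occurs.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Each first name is always paired with the same last name.
first = ["John", "Jane", "Bob", "Alice"] * 5
last = ["Smith", "Johnson", "Williams", "Brown"] * 5
print(round(cramers_v(first, last), 3))   # 1.0

# Two columns whose values are combined evenly show no association.
a = ["x", "x", "y", "y"]
b = ["p", "q", "p", "q"]
print(round(cramers_v(a, b), 3))          # 0.0
```

A coefficient of 1.000 between two name columns therefore signals a one-to-one pairing, not that names carry predictive information.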
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
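Nullity correlation can be sketched as the Pearson correlation between two columns' missing-value indicators (1 where the value is missing, 0 otherwise). The helper name and the toy columns below are hypothetical and are not taken from the profiled dataset.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns; None marks a missing value.
email = [None, "a@example.com", "b@example.com", None, "c@example.com", "d@example.com"]
phone = [None, "+1-555-000-0001", "+1-555-000-0002", None, "+1-555-000-0003", None]
age = [30, None, 41, 52, None, 63]

print(round(nullity_correlation(email, phone), 3))   # 0.707
print(round(nullity_correlation(email, age), 3))     # -0.5
```

A value near 1 means the two columns tend to be missing on the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the gaps are unrelated.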
Sample
First rows
| | customer_id | first_name | last_name | email | phone | age | salary | department | hire_date | is_active |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CUST_01000 | John | Smith | NaN | NaN | NaN | -999 | Sales | 2019-10-13 | False |
| 1 | CUST_01001 | Jane | Johnson | user1@example.com | +1-555-202-1860 | 66.0 | 32049 | Marketing | 2020-10-25 | True |
| 2 | CUST_01002 | Bob | Williams | user2@example.com | +1-555-370-6191 | 75.0 | 61616 | Engineering | 2022-10-03 | True |
| 3 | CUST_01003 | Alice | Brown | user3@example.com | +1-555-800-6734 | 69.0 | 133727 | HR | 2019-04-15 | True |
| 4 | CUST_01004 | Charlie | Jones | user4@example.com | +1-555-221-1466 | 29.0 | 142893 | Finance | 2016-04-17 | True |
| 5 | CUST_01005 | Diana | Garcia | user5@example.com | +1-555-314-5426 | 79.0 | 50932 | Sales | 2018-09-10 | True |
| 6 | CUST_01006 | Eve | Miller | user6@example.com | +1-555-558-9322 | 56.0 | 147796 | Marketing | 2024-10-31 | True |
| 7 | CUST_01007 | Frank | Davis | user7@example.com | +1-555-761-1769 | 19.0 | 59855 | Engineering | 2025-03-01 | True |
| 8 | CUST_01008 | Grace | Rodriguez | user8@example.com | +1-555-443-7949 | 20.0 | 91434 | HR | 2023-12-31 | True |
| 9 | CUST_01009 | Henry | Martinez | user9@example.com | +1-555-485-6311 | 66.0 | 102694 | Finance | 2016-12-23 | True |
Last rows
| | customer_id | first_name | last_name | email | phone | age | salary | department | hire_date | is_active |
|---|---|---|---|---|---|---|---|---|---|---|
| 91 | CUST_01091 | Jane | Johnson | user91@example.com | +1-555-484-2306 | 24.0 | 132946 | Marketing | 2019-05-01 | True |
| 92 | CUST_01092 | Bob | Williams | user92@example.com | +1-555-732-6864 | 77.0 | 139616 | Engineering | 2017-02-07 | True |
| 93 | CUST_01093 | Alice | Brown | user93@example.com | +1-555-358-8526 | 33.0 | 135983 | HR | 2021-04-16 | True |
| 94 | CUST_01094 | Charlie | Jones | user94@example.com | +1-555-809-6575 | 43.0 | 103744 | Finance | 2017-04-21 | True |
| 95 | CUST_01095 | Diana | Garcia | user95@example.com | +1-555-510-5413 | 65.0 | 86491 | Sales | 2025-08-24 | True |
| 96 | CUST_01096 | Eve | Miller | user96@example.com | +1-555-776-1663 | 74.0 | 48589 | Marketing | 2022-07-29 | True |
| 97 | CUST_01097 | Frank | Davis | user97@example.com | +1-555-926-2495 | 69.0 | 73484 | Engineering | 2022-04-27 | True |
| 98 | CUST_01098 | Grace | Rodriguez | user98@example.com | +1-555-332-4763 | 77.0 | 112989 | HR | 2023-07-17 | True |
| 99 | CUST_01099 | Henry | Martinez | user99@example.com | +1-555-212-2853 | 66.0 | 66212 | Finance | 2023-04-05 | True |
| 100 | CUST_01100 | Emma | Wilson | user100@example.com | NaN | NaN | 73525 | Sales | 2022-12-11 | False |